将模型参数适应传入数据流是深度学习可伸缩性的关键因素。有趣的是,在线设置中的先前持续学习策略无意中将其更新的参数锚定在本地参数子空间中,以记住旧任务,否则会偏离子空间并忘记。从这个观察结果,我们在构建多个参数模式和每个模式分配任务之间建立了权衡。模式优化的任务分配(MOTA),我们的贡献适应策略,并行训练多个模式,然后优化每个模式的任务分配。我们从经验上证明了基线连续学习策略以及各种分配变化的改进,即子人群,领域和任务转变。
translated by 谷歌翻译
NLP系统的Black-Box对抗攻击中最近的工作引起了很多关注。先前的黑框攻击假设攻击者可以根据选定的输入观察目标模型的输出标签。在这项工作中,受到对抗性转移性的启发,我们提出了一种新型的黑盒NLP对抗性攻击,攻击者可以选择相似的域并将对抗性示例转移到目标域并在目标模型中导致性能差。基于领域的适应理论,我们提出了一种称为Learn2Weight的防御策略,该策略训练以预测目标模型的重量调整,以防止对类似的对抗性示例的攻击。使用亚马逊多域情绪分类数据集,我们从经验上表明,与标准的黑盒防御方法(例如对抗性训练和防御性蒸馏)相比,Learn2Weight对攻击有效。这项工作有助于越来越多的有关机器学习安全的文献。
translated by 谷歌翻译
对多代理后门攻击的最新探索表明了反击效应,这是对后门攻击的自然防御效果,后门攻击是随机分类的。这产生了低精度W.R.T.的副作用干净的标签,激发本文在构建多代理后门防御的构建方面的工作,该防御能力最大化精确度W.R.T.清洁标签并将毒药标签的标签降至最低。我们建立在代理动力学和低损失子空间结构的基础上,我们贡献了三种防御能力,可提高多代理后门鲁棒性。
translated by 谷歌翻译
数字危害在移动生态系统中普遍存在。由于这些设备在日常生活中获得了更大的突出,因此太大了,因此增加了对个人的恶意攻击的潜力。最后一系列防御一系列数字伤害 - 包括数字分心,通过仇恨言论的政治极化,以及暴露于损坏材料的儿童 - 是用户界面。这项工作介绍了Greaeeterminator,使研究人员能够开发,部署和测试干预措施与最终用户的危害。我们展示了易于干预开发和部署,以及在五个深入案例研究中,潜在地覆盖了GreeSeterMinator的广泛危害。
translated by 谷歌翻译
Due to the high activation sparsity and use of accumulates (AC) instead of expensive multiply-and-accumulates (MAC), neuromorphic spiking neural networks (SNNs) have emerged as a promising low-power alternative to traditional DNNs for several computer vision (CV) applications. However, most existing SNNs require multiple time steps for acceptable inference accuracy, hindering real-time deployment and increasing spiking activity and, consequently, energy consumption. Recent works proposed direct encoding that directly feeds the analog pixel values in the first layer of the SNN in order to significantly reduce the number of time steps. Although the overhead for the first layer MACs with direct encoding is negligible for deep SNNs and the CV processing is efficient using SNNs, the data transfer between the image sensors and the downstream processing costs significant bandwidth and may dominate the total energy. To mitigate this concern, we propose an in-sensor computing hardware-software co-design framework for SNNs targeting image recognition tasks. Our approach reduces the bandwidth between sensing and processing by 12-96x and the resulting total energy by 2.32x compared to traditional CV processing, with a 3.8% reduction in accuracy on ImageNet.
translated by 谷歌翻译
Spiking Neural networks (SNN) have emerged as an attractive spatio-temporal computing paradigm for a wide range of low-power vision tasks. However, state-of-the-art (SOTA) SNN models either incur multiple time steps which hinder their deployment in real-time use cases or increase the training complexity significantly. To mitigate this concern, we present a training framework (from scratch) for one-time-step SNNs that uses a novel variant of the recently proposed Hoyer regularizer. We estimate the threshold of each SNN layer as the Hoyer extremum of a clipped version of its activation map, where the clipping threshold is trained using gradient descent with our Hoyer regularizer. This approach not only downscales the value of the trainable threshold, thereby emitting a large number of spikes for weight update with a limited number of iterations (due to only one time step) but also shifts the membrane potential values away from the threshold, thereby mitigating the effect of noise that can degrade the SNN accuracy. Our approach outperforms existing spiking, binary, and adder neural networks in terms of the accuracy-FLOPs trade-off for complex image recognition tasks. Downstream experiments on object detection also demonstrate the efficacy of our approach.
translated by 谷歌翻译
Solute transport in porous media is relevant to a wide range of applications in hydrogeology, geothermal energy, underground CO2 storage, and a variety of chemical engineering systems. Due to the complexity of solute transport in heterogeneous porous media, traditional solvers require high resolution meshing and are therefore expensive computationally. This study explores the application of a mesh-free method based on deep learning to accelerate the simulation of solute transport. We employ Physics-informed Neural Networks (PiNN) to solve solute transport problems in homogeneous and heterogeneous porous media governed by the advection-dispersion equation. Unlike traditional neural networks that learn from large training datasets, PiNNs only leverage the strong form mathematical models to simultaneously solve for multiple dependent or independent field variables (e.g., pressure and solute concentration fields). In this study, we construct PiNN using a periodic activation function to better represent the complex physical signals (i.e., pressure) and their derivatives (i.e., velocity). Several case studies are designed with the intention of investigating the proposed PiNN's capability to handle different degrees of complexity. A manual hyperparameter tuning method is used to find the best PiNN architecture for each test case. Point-wise error and mean square error (MSE) measures are employed to assess the performance of PiNNs' predictions against the ground truth solutions obtained analytically or numerically using the finite element method. Our findings show that the predictions of PiNN are in good agreement with the ground truth solutions while reducing computational complexity and cost by, at least, three orders of magnitude.
translated by 谷歌翻译
Motivated by mitigating potentially harmful impacts of technologies, the AI community has formulated and accepted mathematical definitions for certain pillars of accountability: e.g. privacy, fairness, and model transparency. Yet, we argue this is fundamentally misguided because these definitions are imperfect, siloed constructions of the human values they hope to proxy, while giving the guise that those values are sufficiently embedded in our technologies. Under popularized methods, tensions arise when practitioners attempt to achieve each pillar of fairness, privacy, and transparency in isolation or simultaneously. In this position paper, we push for redirection. We argue that the AI community needs to consider all the consequences of choosing certain formulations of these pillars -- not just the technical incompatibilities, but also the effects within the context of deployment. We point towards sociotechnical research for frameworks for the latter, but push for broader efforts into implementing these in practice.
translated by 谷歌翻译
Individual-level data (microdata) that characterizes a population, is essential for studying many real-world problems. However, acquiring such data is not straightforward due to cost and privacy constraints, and access is often limited to aggregated data (macro data) sources. In this study, we examine synthetic data generation as a tool to extrapolate difficult-to-obtain high-resolution data by combining information from multiple easier-to-obtain lower-resolution data sources. In particular, we introduce a framework that uses a combination of univariate and multivariate frequency tables from a given target geographical location in combination with frequency tables from other auxiliary locations to generate synthetic microdata for individuals in the target location. Our method combines the estimation of a dependency graph and conditional probabilities from the target location with the use of a Gaussian copula to leverage the available information from the auxiliary locations. We perform extensive testing on two real-world datasets and demonstrate that our approach outperforms prior approaches in preserving the overall dependency structure of the data while also satisfying the constraints defined on the different variables.
translated by 谷歌翻译
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
translated by 谷歌翻译